Efficient decomposition of quantum gates

Optimal implementation of quantum gates is crucial for designing a quantum computer. We
consider the matrix representation of an arbitrary multiqubit gate. By ordering the basis
vectors using the Gray code, we construct a quantum circuit which is optimal in the sense of
fully controlled single-qubit gates and yet is equivalent to the multiqubit gate. In the
second step of the optimization, superfluous control bits are eliminated, which eventually
results in a smaller total number of elementary gates. In our scheme the number of
controlled NOT gates is $O(4^n)$, which coincides with the theoretical lower bound.

Since the early proposal of a quantum-mechanical computer [1], quantum superposition and
entanglement have been found to be potentially useful for computing. For example, Shor's
integer factorization and Grover's database search [3] show considerable speed-up compared
to the known classical algorithms. Moreover, the framework of quantum computing can be used
to describe intriguing entanglement-related phenomena, such as quantum teleportation and
quantum cryptography.

Quantum circuits [?] provide a method to implement an arbitrary quantum algorithm. The
building blocks of quantum circuits are quantum gates, i.e., unitary transformations acting
on a set of qubits. It has previously been shown that a general quantum gate can be
simulated exactly [?] or approximately [?] using a quantum circuit built of elementary gates
which operate only on one and two qubits. Some individual gates operating on n qubits, such
as the quantum Fourier transform, reduce to a number of elementary gates that is polynomial
in n. Unfortunately, this is not the case for an arbitrary n-qubit gate, i.e., a unitary
operation having $4^n$ degrees of freedom. From the practical point of view, the maximum
coherent operation time of the quantum computer is limited by undesirable interactions with
the environment, i.e., decoherence. On the other hand, the number of elementary gates
involved in the decomposition governs the execution time of the quantum algorithm. Hence
the complexity of these quantum-circuit constructions is of great interest.

The conventional approach of reducing an arbitrary n-qubit gate into elementary gates is
given in Ref. 13 and studied with the help of examples in Refs. [?]. The main idea is to
decompose the unitary matrix $U$, which represents the quantum gate, into two-level matrices
and to find a sequence of $C^{n-1}V$ and $C^{n-1}$NOT gates which implements each of them.
Here $C^kV$ denotes the one-qubit gate $V$ having $k$ control bits. The control bits, each
of which has the value zero or one, specify the subspace in which the gate $V$ operates.
This $2^{n-k}$-dimensional subspace consists of those basis vectors for which the values of
the controlled qubits match those of the control bits. In this approach, a number of
$C^{n-1}$NOT gates is required to change the computational basis, such that the two-level
matrix under consideration represents the desired $C^{n-1}V$ gate.
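The control-bit convention can be illustrated with a short sketch. The function name `controlled_subspace` and the dictionary used to describe the control bits are assumptions made here for illustration; qubits are numbered from 1, with qubit 1 as the least significant bit.

```python
from itertools import product

def controlled_subspace(n, controls):
    """Return the n-bit strings (qubit n first, qubit 1 last) whose
    controlled qubits match the given control-bit values.

    controls maps a qubit number (1..n) to its required value (0 or 1).
    """
    states = []
    for bits in product((0, 1), repeat=n):
        # bits[0] holds qubit n and bits[-1] holds qubit 1
        if all(bits[n - q] == v for q, v in controls.items()):
            states.append("".join(map(str, bits)))
    return states

# A gate with k = 2 control bits on n = 3 qubits acts in a
# 2^(3-2) = 2-dimensional subspace.
print(controlled_subspace(3, {3: 1, 2: 0}))  # ['100', '101']
```

With no control bits at all, the gate acts on the full $2^n$-dimensional space, which is why removing controls turns a two-level matrix into a many-level one.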

For the purpose of their physical implementation, all the $C^{n-1}V$ gates can be further
decomposed into a sequence of elementary gates, for instance, using the quantum circuit of
Ref. [?]. For the simulation of a $C^{n-1}$NOT or $C^{n-1}V$ gate, a quantum circuit of
$O(n^2)$ elementary gates is required, while $C^{n-1}W$ requires only $O(n)$ gates, provided
that $W$ is unimodular. In Ref. [5], it was considered that since $O(n)$ $C^{n-1}$NOT gates
are needed between each of the $O(4^n)$ $C^{n-1}V$ gates, the total circuit complexity is
$O(n^3 4^n)$. It has recently been shown with the help of palindromic optimization [12] that
the number of $C^{n-1}$NOT gates required in the simulation can be reduced to $O(4^n)$, which
results in circuit complexity $O(n^2 4^n)$. A constructive upper bound for the optimal
circuit complexity has been reported [9] to be $O(n4^n)$ [?], which may also be achieved by
combining the previous results [?] with the fact that $C^{n-1}$NOT gates may be replaced with
proper CNOT gates upon changing the computational basis. The theoretical lower bound [?] for
the number of CNOT gates needed to simulate an arbitrary quantum gate is
$\lceil (4^n - 3n - 1)/4 \rceil$. However, no circuit construction yielding a complexity less
than $O(n4^n)$ has been reported, nor could one be trivially combined from the previous
results.

In this Letter, we show how to construct a quantum circuit equivalent to an arbitrary
n-qubit gate. The circuit obtained has complexity $O(4^n)$, which scales according to the
predicted theoretical lower bound. The scheme utilizes the reordering of the basis vectors,
i.e., instead of labeling the basis vectors through the binary coding, we employ Gray
codes [15]. The special property of any Gray code basis (GCB) is that only one bit changes
between adjacent basis vectors. Hence no $C^{n-1}$NOT gates are needed in the decomposition.
Furthermore, we find that only a small fraction of the control bits is essential for the
final result of the decomposition. Finally, the elimination of futile control bits reduces
the circuit complexity from $O(n4^n)$ down to $O(4^n)$.

The physical state of an n-qubit quantum register can be represented with a vector
$|\Phi\rangle$ in the associated Hilbert space $\mathbb{C}^N$, where $N = 2^n$. In a given
basis $\{|e_k\rangle\}$, a quantum gate acting on an n-qubit register corresponds to a
certain $2^n \times 2^n$ unitary matrix $U$. The QR-factorization of any matrix can be
performed using the Givens rotation matrices. A Givens rotation ${}^iG_{j,k}$ is a two-level
matrix which operates non-trivially only on two basis vectors, $|e_j\rangle$ and
$|e_k\rangle$. We define ${}^iG_{j,k} = {}^iG_{j,k}(A)$ to be a generic rotation matrix which
selectively nullifies the element on the $i$th column and the $j$th row with the help of the
element on the $i$th column and the $k$th row of a matrix $A$. The nontrivial elements of the
two-level matrix ${}^iG_{j,k} = \{{}^ig_{l,n}\}_{l,n=1}^{N}$ acting on the matrix
$A = \{a_{l,n}\}_{l,n=1}^{N}$ are given by

while the other elements match with the identity matrix. In the special case where the
element $a_{j,i}$ vanishes, the Givens rotation is defined to be an identity matrix.
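A minimal sketch of the nontrivial $2 \times 2$ block of such a rotation follows. The defining equation is not reproduced in this text, so the block below is an assumption: it is one choice that is consistent with the properties stated here (it is unimodular, it nullifies $a_{j,i}$ with the help of $a_{k,i}$, and it leaves a real, non-negative element on row $k$). The function names are illustrative.

```python
import math

def givens_block(a_k, a_j):
    """2x2 unimodular block mapping the column (a_k, a_j) to (r, 0), r >= 0.

    Assumed form: (1/r) * [[conj(a_k), conj(a_j)], [-a_j, a_k]].
    """
    if abs(a_j) == 0:
        # Special case from the text: nothing to nullify, use the identity.
        return [[1, 0], [0, 1]]
    r = math.sqrt(abs(a_k) ** 2 + abs(a_j) ** 2)
    return [[a_k.conjugate() / r, a_j.conjugate() / r],
            [-a_j / r, a_k / r]]

def apply_block(block, a_k, a_j):
    """Multiply the 2x2 block with the column vector (a_k, a_j)."""
    return (block[0][0] * a_k + block[0][1] * a_j,
            block[1][0] * a_k + block[1][1] * a_j)

a_k, a_j = 0.6 + 0.0j, 0.0 + 0.8j
top, bottom = apply_block(givens_block(a_k, a_j), a_k, a_j)
print(abs(top), abs(bottom))  # the norm of the column, and a value close to zero
```

The upper element ends up real and non-negative, which is the property used later to argue that the diagonal elements of a unitary matrix become unity.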

For example, the first Givens rotation we employ results in

$$
{}^1G_{N,N-1}\,U =
\begin{pmatrix}
u_{1,1} & u_{1,2} & \cdots & u_{1,N} \\
\vdots & \vdots & \ddots & \vdots \\
u_{N-2,1} & u_{N-2,2} & \cdots & u_{N-2,N} \\
\tilde{u}_{N-1,1} & \tilde{u}_{N-1,2} & \cdots & \tilde{u}_{N-1,N} \\
0 & \tilde{u}_{N,2} & \cdots & \tilde{u}_{N,N}
\end{pmatrix},
$$

where the elements of $U$ modified by ${}^1G_{N,N-1}$ are indicated with the tilde. Applying
${}^1G_{N-1,N-2}$ to the modified matrix we can nullify the element $\tilde{u}_{N-1,1}$ and
similarly the whole first column, except the diagonal element. The definition of the Givens
rotation ensures that the argument of the diagonal element vanishes and the unitarity of the
matrix $U$ fixes its absolute value to unity. The process is continued through the columns 2
to $N - 1$, resulting in an identity matrix, except for the diagonal element on the $N$th row
which becomes $\det(U)$. In fact, without loss of generality we may assume that
$U \in SU(2^n)$, since the nonzero argument of the determinant of $U$ contributes only to the
global phase of the state vector $|\Phi\rangle$, which is not measurable. Thus we obtain the
factorization

$$
\left( \prod_{i=1}^{2^n-1} \prod_{j=i+1}^{2^n} {}^iG_{j,j-1} \right) U = I, \tag{1}
$$

where the order of the products is taken from left to right, i.e., the first element
${}^{2^n-1}G_{2^n,2^n-1}$ is the leftmost matrix in the product. The assumption of the
unimodularity of the matrix $U$ may be dropped if one first applies a matrix
$e^{-i\arg[\det(U)]/N}$, which may be realized with a single one-qubit gate.
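The column-by-column procedure can be sketched numerically as follows. This is an illustration under assumptions rather than the construction itself: the $2 \times 2$ block is the unimodular form assumed to match the stated properties of the Givens rotation, the function name `givens_reduce` is hypothetical, and the test matrix (a $4 \times 4$ discrete Fourier transform) is chosen only because it is known to be unitary.

```python
import math

def givens_reduce(U):
    """Nullify the subdiagonal of U column by column, rotating rows j-1 and j."""
    N = len(U)
    A = [list(row) for row in U]
    for i in range(N - 1):                 # column index (0-based)
        for j in range(N - 1, i, -1):      # row to nullify, from the bottom upward
            a_k, a_j = A[j - 1][i], A[j][i]
            if abs(a_j) < 1e-14:
                continue                   # element already vanishes: identity
            r = math.hypot(abs(a_k), abs(a_j))
            g = [[a_k.conjugate() / r, a_j.conjugate() / r],
                 [-a_j / r, a_k / r]]
            for c in range(N):             # update the two affected rows
                x, y = A[j - 1][c], A[j][c]
                A[j - 1][c] = g[0][0] * x + g[0][1] * y
                A[j][c] = g[1][0] * x + g[1][1] * y
    return A

# Test unitary: the 4x4 discrete Fourier transform matrix (two qubits).
U = [[(1j ** (r * c)) / 2 for c in range(4)] for r in range(4)]
A = givens_reduce(U)
off_diagonal = max(abs(A[r][c]) for r in range(4) for c in range(4) if r != c)
print(off_diagonal < 1e-9, [round(abs(A[r][r]), 9) for r in range(4)])
```

For a unitary input the reduced matrix is diagonal up to rounding errors, with every diagonal element of unit modulus. Because each rotation is unimodular, the determinant of the input is carried by the last diagonal element.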

For quantum computation, it is convenient to choose the basis vectors according to
$|e_k\rangle = \otimes_i |x_i^k\rangle$, where $x_i^k \in \{0, 1\}$ and the index
$i = 1, \ldots, n$ refers to the physical qubit $i$. Here we note that the order of the
basis vectors in the computational basis is not fixed. In the previous approaches [?], the
order of the basis vectors has been chosen such that the values $x_i^k$ essentially form the
binary representation of the number $k - 1$, i.e., $k = 1 + \sum_{i=1}^{n} x_i^k 2^{i-1}$.
However, the coefficients $x_i^k$ can just as well be chosen to form a Gray code [15]
corresponding to the number $k - 1$. A Gray code of n qubits $\{c_i^n\}$ is a
palindrome-like ordering of binary numbers having the special property that the adjacent
elements $c_i^n$ and $c_{i+1}^n$ differ only in one bit from each other. We choose to use
such a Gray code in which each bit string $c_i^n = b_n^i \cdots b_2^i b_1^i$ is obtained from
the binary representation $i_b$ of the number $i$ as $c_i^n = i_b$ XOR $(i_b/2)$, where the
division discards the remainder. Furthermore, we define a function $\gamma$ to represent the
value of the bit string $c_i^n$ plus one, i.e., $\gamma(i) = 1 + \sum_{j=1}^{n} b_j^i 2^{j-1}$.
An example of the Gray code and the function $\gamma$ for the case $n = 4$ is presented in
Fig. 1(a).
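The Gray code and the function $\gamma$ can be generated with a few lines of code. The sketch assumes a 1-based index $i$ whose bit string is built from the binary representation of $i - 1$; this is the convention that reproduces the row numbers quoted in the four-qubit example of this Letter, and it is stated here as an assumption. The function names are illustrative.

```python
def gray_code(i, n):
    """Bit string c_i^n for a 1-based index i (assumed to encode i - 1)."""
    b = i - 1
    return format(b ^ (b >> 1), "0{}b".format(n))

def gamma(i, n):
    """Value of the bit string c_i^n plus one."""
    return int(gray_code(i, n), 2) + 1

n = 4
for i in range(1, 2 ** n + 1):
    print(i, gray_code(i, n), gamma(i, n))
```

The values $\gamma(1), \ldots, \gamma(2^n)$ form a permutation of $1, \ldots, 2^n$, which is what allows $\gamma$ to be used as a relabeling of the basis vectors.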

The advantage of using the GCB instead of the binary code basis (BCB) is that a unitary
two-level matrix operating on adjacent basis vectors equals the matrix representation of
some $C^{n-1}V$ gate. Consequently, each of the $2^{n-1}(2^n - 1)$ Givens rotations
${}^iG_{j,j-1}$ can be implemented using only one fully controlled single-qubit gate
$C^{n-1}V$ and no $C^{n-1}$NOT gates are needed, unlike in previous schemes [?].
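The two facts used in this paragraph, the single-bit difference between adjacent Gray code words and the number of Givens rotations, can be checked directly. The sketch below uses the standard reflected Gray code with 0-based integers; the helper names are illustrative.

```python
def gray(i):
    """Reflected Gray code of the integer i."""
    return i ^ (i >> 1)

def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

n = 4
N = 2 ** n
gray_steps = [hamming(gray(i), gray(i - 1)) for i in range(1, N)]
binary_steps = [hamming(i, i - 1) for i in range(1, N)]
# One rotation per element below the diagonal: columns i, rows j > i.
rotations = sum(1 for i in range(1, N) for j in range(i + 1, N + 1))
print(max(gray_steps), max(binary_steps), rotations)  # 1 4 120
```

In the binary ordering, neighboring basis vectors can differ in up to $n$ bits, which is why the earlier schemes needed additional $C^{n-1}$NOT gates to bring each two-level matrix into the form of a controlled single-qubit gate.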

Let us denote the permutation matrix accomplishing the transformation of basis from the GCB
to the BCB by $\Pi$. Since the conventional basis for the matrix representations is the BCB,
we rewrite Eq. (1) in the BCB as

$$
\left( \prod_{i=1}^{2^n-1} \prod_{j=i+1}^{2^n} \Pi\, {}^iG_{j,j-1}\, \Pi^{\dagger} \right)
U^{\mathrm{BCB}} = I. \tag{2}
$$



Since the matrix $\Pi$ is just a permutation of the basis vectors defined by the function
$\gamma$, Eq. (2) yields

$$
\left( \prod_{i=1}^{2^n-1} \prod_{j=i+1}^{2^n}
{}^{\gamma(i)}G^{\mathrm{BCB}}_{\gamma(j),\gamma(j-1)} \right) U^{\mathrm{BCB}} = I. \tag{3}
$$



It is seen from Eq. (3) that every Givens rotation
${}^{\gamma(i)}G^{\mathrm{BCB}}_{\gamma(j),\gamma(j-1)}$ acts nontrivially only on the basis
vectors $|e_{\gamma(j)}\rangle$ and $|e_{\gamma(j-1)}\rangle$, for which the binary
representations differ only in one bit. It is also noted that the column order of the
diagonalization process is changed according to the function $\gamma$, which was not
utilized in Ref. [12], in which a fixed column order was assumed in the palindromic
optimization. The decomposition of an arbitrary matrix $U$ in terms of fully controlled
single-qubit gates may now be constructed straightforwardly according to Eq. (3). It also
determines the numerical values of the generic Givens rotation matrices. The quantum circuit
for an arbitrary three-qubit gate is shown in Fig. 2, where we have assumed that all the
matrices are given in the BCB. Since ${}^i\Gamma_{j,k} \in SU(2)$, the gates
$C^{n-1}({}^i\Gamma_{j,k})$ decompose into $O(n)$ elementary gates. Thus the gate complexity
of this construction is $O(n4^n)$, which already realizes the former upper bound by
Knill [9].

FIG. 1: (a) Illustration of the Gray code $c_i^4$. White squares stand for bit value 0 and
black squares denote 1. The function $\gamma(i)$ represents the value of the bit string
$c_i^4$ plus one. (b) Table of dimensions $2^4 \times 2^4$ shows the number of control bits
used while nullifying the elements of the matrix $U$. The width of the line $L_s$ represents
the number $p = 3 - s$ of control bits required to zero the element below the line with the
element above it.

Let the unitary matrix $U$ be given in the GCB, as well as the generic Givens rotation
matrices ${}^iG_{j,k}$, each of which may be realized with a $C^{n-1}({}^i\Gamma_{j,k})$ gate.
Since only matrices with consecutive indices are needed in the diagonalization procedure, we
simplify the notation into ${}^iG_j := {}^iG_{j,j-1}$ and
${}^i\Gamma_j := {}^i\Gamma_{j,j-1}$. If $s$ control bits are removed from a
$C^{n-1}({}^i\Gamma_j)$ gate, the matrix representation ${}^iG_j$ of such an operation is no
longer two-level, but rather $2^{s+1}$-level, i.e., the matrix ${}^iG_j$ applies the matrix
${}^i\Gamma_j$ to all pairs of basis vectors which satisfy the remaining control conditions
and differ in the same bit $b_{m_j}$ as the bit strings $c_j^n$ and $c_{j-1}^n$. Note that
the structure of the Gray code assures that the bit strings $c_j^n$ and $c_{j-1}^n$ differ in
no other bits except the bit number $m_j$. Our aim is to diagonalize the matrix $U$ by $p$
times controlled single-qubit gates $C^p({}^i\Gamma_j)$ in the order given above, using the
minimum number of control bits. Once some element becomes zero in the diagonalization
process, we must use control bits in such a way that it does not mix with the nonzero
elements.

Let us consider, for example, the diagonalization of an arbitrary four-qubit gate, for which
the Gray code is shown in Fig. 1(a). When we are about to perform the first rotation
${}^1G_{16}$, we may discard all the control bits from $C^3({}^1\Gamma_{16})$ and the matrix
representation of ${}^1G_{16}$ becomes $2 \times 2$-block diagonal. In the implementation of
${}^1G_{15}$, we must control the bit number 1, since otherwise the matrix would operate on
elements in the rows 13 and 16, which is forbidden since the nonzero element on row 13 would
mix with the annihilated element on row 16. In the next step, where we zero the element on
row and column (14, 1), we may again discard all of the control bits, since both elements in
the pair {(15, 1), (16, 1)} are zero and unaffected by the action of the matrix
${}^1\Gamma_{14}$, while all the other pairs are nonzero and thus allowed to mix with each
other. Actually, while adjusting the element in position $(j, 1)$ to zero, we do not have to
use upper controls, i.e., no control bits with number greater than $m_j$ are needed. When
working on the second column, we may remove all the upper controls with the restriction that
at least one of the control bits must have the value 1, since the only nonzero element in
the first column at position (1, 1) is not allowed to mix with any other element. To support
the determination of the control bits required, we produced Fig. 1(b), which shows the number
$p$ of control bits needed for each $C^p({}^i\Gamma_j)$ gate in the whole diagonalization
process of the matrix $U \in SU(2^4)$.
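The row pairs mentioned in this example can be reproduced with a short sketch. It assumes that row $j$ (1-based) of the GCB carries the reflected Gray code of $j - 1$ and that control values are taken from the bit string of row $j$; the function names are illustrative.

```python
def gray(i):
    return i ^ (i >> 1)

def flipped_bit(j):
    """Bit number m_j (1 = least significant) in which rows j and j-1 differ."""
    return (gray(j - 1) ^ gray(j - 2)).bit_length()

def affected_row_pairs(j, n, controlled_bits):
    """Row pairs (1-based, Gray-code order) mixed by the gate realizing the
    rotation on rows j and j-1 when only controlled_bits are kept as controls."""
    m = flipped_bit(j)
    ref = gray(j - 1)                                   # bit string of row j
    rows = {gray(r - 1): r for r in range(1, 2 ** n + 1)}
    pairs = set()
    for code, r in rows.items():
        # keep the row only if it agrees with row j on every control bit
        if all((code >> (b - 1)) & 1 == (ref >> (b - 1)) & 1
               for b in controlled_bits):
            partner = rows[code ^ (1 << (m - 1))]       # same string, bit m flipped
            pairs.add((min(r, partner), max(r, partner)))
    return sorted(pairs)

print(affected_row_pairs(16, 4, []))    # adjacent rows only: 2x2-block diagonal
print(affected_row_pairs(15, 4, []))    # contains (13, 16), which is forbidden
print(affected_row_pairs(15, 4, [1]))   # controlling bit 1 removes (13, 16)
```

The second call shows why the rotation on rows 15 and 14 cannot be left uncontrolled: without the control on bit 1 it would also mix row 13 with the already annihilated element on row 16.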

Let us assume that we are diagonalizing an arbitrary matrix $U \in SU(N)$ and aim to
annihilate the element in position $(j, i)$. Provided that $j > 2^{n-1}$ and
$i \le 2^{n-1}$, all the upper controls may be dropped except that if
$i - 1 \ge 2^{m_j - 1}$, the bit $n$ with value 1 is also controlled. The number of the
control bits becomes $C^i_{m_j} = m_j - 1 + \Theta[i - 1 - 2^{m_j - 1}]$, where the function
$\Theta(x) = 1$ for $x \ge 0$ and $\Theta(x) = 0$ for $x < 0$. Let us denote by $g_n^0(k)$
the number of $C^kV$ gates needed while nullifying the bottom left-hand-side quarter of the
matrix $U$ and similarly $g_n(k)$ for the whole diagonalization process. Since the bit $m$
differs in the two consecutive bit strings $c_j^n$ and $c_{j-1}^n$ in total
$q_m = \max(2^{n-m-1}, 1)$ times on rows $2^n \ge j > 2^{n-1}$, we obtain

$$
g_n^0(k) = \sum_{i=1}^{2^{n-1}} \sum_{m=1}^{n} q_m\, \delta_{C^i_m, k}
= \max(2^{n-2}, 2^k) + \Theta(k - 1)\left(2^{2n-k-2} - 2^{n-2}\right), \tag{4}
$$

where $\delta$ is the Kronecker delta.
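The closed form in Eq. (4) can be compared against a direct count over the bottom left-hand-side quarter. The sketch assumes the 1-based row convention in which row $j$ carries the Gray code of $j - 1$, and it restricts the closed form to $n \ge 2$ and $0 \le k \le n - 1$; the function names are illustrative.

```python
def gray(i):
    return i ^ (i >> 1)

def theta(x):
    return 1 if x >= 0 else 0

def g0_bruteforce(n, k):
    """Count gates with k controls over rows j > 2^(n-1), columns i <= 2^(n-1)."""
    half = 2 ** (n - 1)
    count = 0
    for i in range(1, half + 1):
        for j in range(half + 1, 2 * half + 1):
            m = (gray(j - 1) ^ gray(j - 2)).bit_length()   # flipped bit m_j
            if m - 1 + theta(i - 1 - 2 ** (m - 1)) == k:
                count += 1
    return count

def g0_closed(n, k):
    """Right-hand side of Eq. (4), for n >= 2 and 0 <= k <= n - 1."""
    extra = 2 ** (2 * n - k - 2) - 2 ** (n - 2) if k >= 1 else 0
    return max(2 ** (n - 2), 2 ** k) + extra

for n in range(2, 6):
    print(n, [(g0_bruteforce(n, k), g0_closed(n, k)) for k in range(n)])
```

Summed over $k$, both counts give $4^{n-1}$, the number of matrix elements in the quarter.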

FIG. 2: Quantum circuit equivalent to an arbitrary three-qubit quantum gate up to a global
phase. The control bits indicated with a black square on the upper right-hand-side corner
are superfluous and may be omitted to decrease the complexity of the decomposition, while
the generic nature of the $C^k({}^i\Gamma_{j,k})$ gates assures that the result remains
invariant.

TABLE I: Number of CNOT gates and the total number of single-qubit and CNOT gates needed for
the implementation of an arbitrary n-qubit gate in the scheme described.

n        1    2     3     4      5       6        7        8         9
CNOT     0    4    64   536   4156   22618   108760   486052   2078668
total    1   14   136   980   7384   42390   208820   944280   4062520

The number of $C^kV$ gates needed in the diagonalization process of the top left-hand-side
quarter of the matrix $U$ is $g_{n-1}(k)$, while $g_{n-1}(k - 1)$ gates are needed for the
bottom right-hand-side quarter. This yields a recursion relation
$g_n(k) = g_n^0(k) + g_{n-1}(k) + g_{n-1}(k - 1)$ with the conditions $g_m(0) = 2^{m-1}$ and
$g_m(m) = 0$ for all $m \in \{1, 2, \ldots, n\}$. We rewrite the recursion relation as

$$
g_n(n - i) = 2^{i-1} + \sum_{m=i+1}^{n} \left[ g_m^0(m - i) + g_{m-1}(m - i) \right]. \tag{5}
$$

For $i = 1$ the terms $g_{m-1}(m - 1)$ vanish and the summation may be carried out with the
help of Eq. (4), yielding $g_n(n - 1) = 3 \cdot 2^{n-1} - 2$. The general solution of
Eq. (5) contains summations and combinatorial factors. Thus, it is more convenient to give a
simple upper bound

$$
g_n(n - i) < 2^{n+i}. \tag{6}
$$

Equation (6) is satisfied when $i = 1$ and it follows by induction using Eq. (5) that the
upper bound holds for all $i \in \{1, 2, \ldots, n - 1\}$.
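The recursion relation, the closed form for $i = 1$ and the upper bound can be checked numerically for small n. The sketch below treats the counts as zero outside $0 \le k \le n - 1$, which is an assumption consistent with the stated conditions; the function names are illustrative.

```python
from functools import lru_cache

def g0(n, k):
    """Eq. (4): gates with k controls for the bottom left-hand-side quarter."""
    if n < 2 or k < 0 or k > n - 1:
        return 0
    extra = 2 ** (2 * n - k - 2) - 2 ** (n - 2) if k >= 1 else 0
    return max(2 ** (n - 2), 2 ** k) + extra

@lru_cache(maxsize=None)
def g(n, k):
    """Number of C^k V gates in the whole diagonalization of an n-qubit gate."""
    if n < 1 or k < 0 or k >= n:
        return 0                      # includes the condition g_m(m) = 0
    if k == 0:
        return 2 ** (n - 1)           # condition g_m(0) = 2^(m-1)
    return g0(n, k) + g(n - 1, k) + g(n - 1, k - 1)

for n in range(1, 7):
    counts = [g(n, k) for k in range(n)]
    assert sum(counts) == 2 ** (n - 1) * (2 ** n - 1)             # all rotations
    assert g(n, n - 1) == 3 * 2 ** (n - 1) - 2                    # case i = 1
    assert all(g(n, n - i) < 2 ** (n + i) for i in range(1, n))   # Eq. (6)
    print(n, counts)
```

The first assertion confirms that the gate counts add up to the $2^{n-1}(2^n - 1)$ Givens rotations of the factorization, so removing control bits changes how many controls each gate carries, not how many gates there are.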

To calculate the number of elementary gates, we use the decompositions described in
Ref. [?]. Table I shows the number of elementary gates calculated with the exact solution of
Eq. (5). For large n, the leading contribution to the number of CNOT gates is approximately
$8.7 \cdot 4^n$, while the upper bound from Eq. (6) yields approximately $11 \cdot 4^n$.

In conclusion, we have presented a construction which provides an efficient way to implement
arbitrary quantum gates. The initial circuit is optimal in the sense that no $C^{n-1}$NOT
gates are needed to permute the basis vectors. Due to the structure of the gate sequence, we
are able to eliminate a large fraction of the control bits, which results in a circuit of
complexity $O(4^n)$. We note that neither of the two techniques alone, the GCB
representation or the elimination of the control bits, suffices to decrease the circuit
complexity from $O(n4^n)$ to $O(4^n)$.

For certain physical realizations, the implementation of the $C^{n-1}V$ gate is, in
principle, straightforward and no decomposition into elementary gates is needed [18]. To
further optimize the design, one could consider utilizing some tailored multiqubit gates
[19, 20] instead of a set of elementary gates, or using a decomposition of the matrix $U$
other than the one into Givens rotations.